DOCUMENT RESUME

ED 176 589                                        FL 010 823

AUTHOR          Jarvis, Gilbert A.; Adams, Shirley J.
TITLE           Evaluating a Second Language Program. Language in
                Education: Theory and Practice, No. 19.
INSTITUTION     ERIC Clearinghouse on Languages and Linguistics,
                Arlington, Va.
SPONS AGENCY    National Inst. of Education (DHEW), Washington,
                D.C.
PUB DATE        Sep 79
'♦Up, 

AVAILABLE FROM  Center for Applied Linguistics, 1611 Kent Street,
                Arlington, Virginia 22209 ($2.95)

EDRS PRICE      MF01/PC02 Plus Postage.
DESCRIPTORS     Classroom Observation Techniques; Data Analysis; Data
                Collection; *Educational Objectives; *Evaluation
                Methods; Guidelines; Instructional Materials;
                Language Instruction; *Language Programs; Language
                Tests; Measurement Instruments; *Program Evaluation;
                *Second Language Learning; Student Behavior; *Student
                Evaluation; Teacher Behavior
IDENTIFIERS     Information Analysis Products



ABSTRACT
This introduction to second language program
evaluation begins with a rationale outlining the major issues
involved in program evaluation. Procedures for evaluation are
described, and examples are offered of goal statements, sub-goal
statements, curriculum objectives, and learning objectives.
Guidelines for judging a statement of aims are listed. Procedures are
set forth for selecting measurement instruments (including discussion
of several specific instruments), and for collecting and analyzing
data. Evaluation of classroom behavior and of instructional materials
is outlined under the headings of student population, program scope,
administrative considerations, methodology/classroom activities,
teacher/student behavior, testing, materials, and facilities.
Suggestions are offered for writing evaluation reports. Appendix A
describes principal evaluation models, and Appendix B discusses two
specific observation instruments. (OB)



* Reproductions supplied by EDRS are the best that can be made *
* from the original document. *



LANGUAGE IN EDUCATION: THEORY AND PRACTICE 

19 



Evaluating a Second Language Program 



Gilbert A. Jarvis
Shirley J. Adams

Published by
Center for Applied Linguistics

Prepared by
ERIC Clearinghouse on Languages and Linguistics

"PERMISSION TO REPRODUCE THIS
MATERIAL HAS BEEN GRANTED BY

TO THE EDUCATIONAL RESOURCES
INFORMATION CENTER (ERIC)."

U.S. DEPARTMENT OF HEALTH,
EDUCATION & WELFARE
NATIONAL INSTITUTE OF
EDUCATION

THIS DOCUMENT HAS BEEN REPRODUCED
EXACTLY AS RECEIVED FROM
THE PERSON OR ORGANIZATION ORIGINATING
IT. POINTS OF VIEW OR OPINIONS
STATED DO NOT NECESSARILY REPRESENT
OFFICIAL NATIONAL INSTITUTE OF
EDUCATION POSITION OR POLICY.



Language in Education: Theory and Practice 
Series ISBN: 87281-092-5 

ISBN: 87281-105-0 

September 1979 
Copyright © 1979

By the Center for Applied Linguistics 
1611 North Kent Street 
Arlington, Virginia 22209 



Printed in the U.S.A. 



LANGUAGE IN EDUCATION: THEORY AND PRACTICE

ERIC (Educational Resources Information Center) is a nationwide network
of information centers, each responsible for a given educational level or
field of study. ERIC is supported by the National Institute of Education
of the U.S. Department of Health, Education and Welfare. The basic
objective of ERIC is to make current developments in educational research,
instruction, and personnel preparation more readily accessible to
educators and members of related professions.

ERIC/CLL. The ERIC Clearinghouse on Languages and Linguistics (ERIC/CLL),
one of the specialized clearinghouses in the ERIC system, is operated by
the Center for Applied Linguistics. ERIC/CLL is specifically responsible
for the collection and dissemination of information in the general
area of research and application in languages, linguistics, and language
teaching and learning.

LANGUAGE IN EDUCATION: THEORY AND PRACTICE. In addition to processing
information, ERIC/CLL is also involved in information synthesis and
analysis. The Clearinghouse commissions recognized authorities in
languages and linguistics to write analyses of the current issues in their
areas of specialty. The resultant documents, intended for use by
educators and researchers, are published under the title Language in
Education: Theory and Practice.* The series includes practical guides for
classroom teachers, extensive state-of-the-art papers, and selected
bibliographies.

The material in this publication was prepared pursuant to a contract with
the National Institute of Education, U.S. Department of Health, Education
and Welfare. Contractors undertaking such projects under Government
sponsorship are encouraged to express freely their judgment in
professional and technical matters. Prior to publication, the manuscript was
submitted to the American Council on the Teaching of Foreign Languages
for critical review and determination of professional competence. This
publication has met such standards. Points of view or opinions, however,
do not necessarily represent the official view or opinions of either
ACTFL or NIE. This publication is not printed at the expense of the
Federal Government.

This publication may be purchased directly from the Center for Applied
Linguistics. It also will be announced in the ERIC monthly abstract
journal Resources in Education (RIE) and will be available from the ERIC
Document Reproduction Service, Computer Microfilm International Corp.,
P.O. Box 190, Arlington, VA 22210. See RIE for ordering information and
ED number.

For further information on the ERIC system, ERIC/CLL, and
Center/Clearinghouse publications, write to ERIC Clearinghouse on Languages
and Linguistics, Center for Applied Linguistics, 1611 North Kent Street,
Arlington, VA 22209.

*From 1974 through 1977, all Clearinghouse publications appeared as the
CAL·ERIC/CLL Series on Languages and Linguistics. Although more papers
are being added to the original series, the majority of the ERIC/CLL
information analysis products will be included in the Language in
Education series.




Introduction 



Many second language educators are uncomfortable with
program evaluation. Yet instructors at all levels of
education and in all subject matter areas are now
participating increasingly in the evaluation of their programs.
In many instances they are being asked to plan and conduct
formal reviews of their programs. Several factors seem
to be converging to cause program evaluation to become an
even more visible part of the educational process in the
years ahead. Most notably, education is becoming more
expensive each year. Its quality is becoming a matter of
great public concern, and, at the same time, the body of
knowledge dealing with educational evaluation is becoming
increasingly sophisticated.

Language teachers in the United States are also
experiencing the effects of several additional factors. The
status of language study in the total context of our
educational enterprise has not been favorable recently. At
no level of education are language programs viewed as
essential or basic to the quality of a person's education.
We, as language teachers, have never generated a broad base
of support. Public opinion polls repeatedly document our
last-place status in public acceptance. Significant doubt
about the quality of our programs is one of several factors
that have contributed to this situation. ("They don't
teach you the real language that everyone speaks.") Our
history is not one of careful, dispassionate inquiry
designed to determine optimum teaching strategies. We
have, instead, a conspicuous history of conflicting
methodologies and ideologies. We have, moreover, often
promised greater proficiency than we could teach, and we have
in many programs restricted our instruction to an elite
segment of the school population. Factors such as these
are responsible for a special type of vulnerability, a
vulnerability that sometimes becomes manifest during
evaluation procedures.

The notion of evaluation often creates feelings of doubt,
apprehension, and antipathy among many language teachers.
Evaluation has been threatening, both to their egos and
to their careers. Unfortunately, evaluation in the
mid-1970s has led more often to hostile decisions by
administrators than to better education for everyone. In the
worst cases, evaluation is reduced to a pretext for
eliminating programs when the decision to eliminate has
already been made though not promulgated. The classic
case is that of the administrator who wants to reduce or
eliminate the program and who structures the evaluation
so that it accentuates weaknesses in the program.

The climate for evaluation has been further worsened by
confusion between program evaluation and instructor
evaluation. Program evaluation is not a euphemism for
evaluating instructors. Instructors are only one component of
the instructional process, and evaluation of instructional
programs must involve all aspects of that process.

One particular type of evaluation, accreditation
evaluation, is most familiar to teachers. Their experience
with this type of evaluation does not, however, generally
contribute to its perception as a means for improving
instruction. Usually, it is viewed as an inevitable
nuisance required by a remote and nebulous agency; it is
an exercise to be done so that it can be forgotten for
another ten years.

In an effort to accommodate the backgrounds of language
educators who confront evaluation for the first time, the
purposes of this paper are (1) to summarize important
thinking and issues in evaluation as they relate to second
language programs, (2) to describe procedures for
evaluating a second language program, and (3) to describe
procedures for analyzing evaluation data. The reader should
recognize that because evaluation is a large and complex
process, a brief paper such as this must omit many of its
dimensions and aspects; moreover, every program is unique,
and examples are of only limited utility. This document
should therefore be considered a primer of second
language program evaluation.




Evaluation activity must be premised on the conviction
that conscientious, honest evaluation can lead to better
programs and therefore to a more significant role for
language educators in the total educational process.
Evaluation should not be a distasteful or threatening
task; it is a means to professional self-understanding
and self-improvement. In the final analysis, however, it
is only the behavior of educators that can transform these
words from hollow echoes to more and better programs.

What Is Evaluation? 

Evaluation can be an ambiguous term. On the one hand, we
recognize it as an everyday activity in all our lives.
We evaluate when we shop, when we turn on a television
set and select a station, when we choose driving routes
to work, or when we dress for a particular occasion. We
may even be aware that each of us does not perform all
these evaluations equally well; some are successful, but
at other times we have to acknowledge that we have used
bad judgment or made poor decisions. On the other hand,
when we read the professional evaluation literature,
evaluation is transformed into an apparently obscure
process shrouded in technical jargon and concepts. In
truth, we probably oversimplify our "lay" evaluation
activity. Tyler et al. (1967) note that educational
evaluation must always take into account the full
complexity of the phenomenon:

Educational programs are characterized by their
purposes, their content, their environments,
their methods, and the changes they bring about.
Usually there are messages to be conveyed,
relationships to be demonstrated, concepts to be
symbolized, understandings and skills to be
acquired. Evaluation is complex because each
of the many characteristics requires separate
attention. (1967:4-5)

The complexity of the teaching-learning process is
intensified in the second language classroom. In our
classrooms many dimensions of the process are in fact different
or appear different from those of other subject matter
areas. This is one of the most compelling reasons for
involvement (and therefore competence) in the evaluation
process on the part of language educators.

An evaluator, whether a professional from outside or a
teacher evaluating his or her own program, may be involved
in such disparate tasks as judging the worth of a
particular educational goal, determining whether a test does
indeed measure the ability specified by a particular
instructional objective, determining the prerequisite
skills or sophistication for a particular instructional
unit, or comparing a conventional one-hour-per-day time
format to a one-month, all-day intensive program.
Evaluation cannot be reduced to a simple procedure in which
one follows a finite series of rigidly prescribed steps,
nor is it uncomplicated conceptually. Definitions vary
from author to author: "an assessment of merit,"
"determination of attainment of objectives," "procuring
information to use in decision making," or "comparisons between
alternate programs." All the definitions imply certain
commonalities. They all require systematic efforts to
define criteria and to obtain accurate information about
the program characteristics. Evaluation must be
characterized by what Cronbach and Suppes refer to as
"disciplined inquiry":

Disciplined inquiry has a quality that
distinguishes it from other sources of opinion and
belief. The disciplined inquiry is conducted
and reported in such a way that the argument
can be painstakingly examined. The report does
not depend for its appeal on the eloquence of
the writer or on any surface plausibility. The
argument is not justified by anecdotes or
casually assembled fragments of evidence . . . .

The report of a disciplined inquiry has a
texture that displays the raw materials entering
the argument and the logical processes by which
they were compressed and rearranged to make the
conclusion credible . . . .

Disciplined inquiry does not necessarily follow
well-established formal procedures. Some of
the most excellent inquiry is free ranging and
speculative in its initial stages, trying what
might seem to be bizarre combinations of ideas
and procedures, or restlessly casting about for
ideas . . . . (1969:15-16, 18)

Cronbach and Suppes verbalize a very important principle
for the teacher participating in an evaluation. It is
crucial that the collection of information about the
program be as free from bias as is possible. The
information must be gathered dispassionately and organized
systematically. If the principle of disciplined inquiry
does not guide the process, evaluation in the true sense
of the term cannot occur; instead, one creates propaganda.
Thus, if students are interviewed during an evaluation,
it is essential that they be representative of all the
students in the program. If enrollment data are described,
they must tell the "full story." Enrollment figures from
the first day of class are not valid, for example, if 40
percent of the students drop the course before the end of
one semester. Also, student achievement scores must
include all scores or a representative sample.
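
The two cautions above, about first-day enrollment figures and about
representative interviewees, can be illustrated with a short Python
sketch. The function names, the roster, and the enrollment numbers are
hypothetical and chosen only for illustration; a simple random draw is
one possible way, not the only way, to avoid hand-picking students.

```python
import random

def retention_rate(first_day: int, end_of_term: int) -> float:
    """Share of first-day enrollees still enrolled at the end of the term."""
    if first_day <= 0:
        raise ValueError("first_day must be positive")
    return end_of_term / first_day

def sample_for_interviews(roster: list[str], k: int, seed: int = 0) -> list[str]:
    """Draw a simple random sample so that interviewees are not hand-picked."""
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    return rng.sample(roster, min(k, len(roster)))

# Hypothetical course: 50 students on the first day, 30 at the end of the
# semester, i.e. the 40 percent attrition used as an example in the text.
first_day, end_of_term = 50, 30
rate = retention_rate(first_day, end_of_term)
print(f"first-day enrollment: {first_day}, retained: {rate:.0%}")

# Sample from everyone who started, including students who later dropped,
# so that the interviews can tell the "full story".
roster = [f"student_{i:02d}" for i in range(1, first_day + 1)]
interviewees = sample_for_interviews(roster, k=6)
print(interviewees)
```

Reporting the retention rate next to the first-day figure keeps the
enrollment description honest; the sampling function only guards against
selection by the evaluator and does not by itself guarantee balance
across languages or levels.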

Evaluation shares this quality of disciplined inquiry with
research and program development. Earlier
conceptualizations of evaluation were very narrow. It was made
synonymous with measurement (testing), with professional
judgment, or with comparisons between performance data
and the objectives of the program. Recent definitions are
more ecumenical and pluralistic, stressing both systematic
procedures and judgment of worth. The conceptualization
that guides this paper is similar to that of Scriven (1967)
and virtually identical to that of Worthen and Sanders
(1973): "Evaluation is the determination of the worth of a
thing. It includes obtaining information for use in
judging the worth of a program, product, procedure, or
objective, or the potential utility of alternative approaches
designed to attain specified objectives" (p. 19).

In a holistic approach, data must be collected on all
variables or factors that may reflect the quality of a
program, whether they be process variables or product
variables. These data must be collected as systematically
as possible. Once the data are collected, judgments of
worth can then be made by asking whether the composite
data indicate high or low quality. Similarly, judgments
can be made about particular components or dimensions of
the program. Should the particular component be retained,
modified, or replaced by an alternative arrangement?



This approach assumes, furthermore, that a program cannot
exist in a vacuum. A program is influenced by and itself
influences various persons or groups with differing needs
and points of view. Procedures for carrying out an
evaluation must take into account the nature of the particular
program. An evaluator must be analytical and circumspect.

[For the reader interested in the full range of
conceptualization of evaluation, a brief summary of general models
is given in Appendix A.]



Purposes of Evaluation

A useful way to conceptualize the purposes of evaluation
is to dichotomize them into formative and summative
evaluation. The purpose of formative evaluation is to improve
the instruction. It asks, in effect, about the current
status of the program so that it can be made better. It is
evaluation that is carried out during the development,
implementation, and operation of the program. Summative
evaluation is terminal evaluation of a program that is
already operational. Its purpose is to make judgments
about the program's worth. One can also use it to
determine which evaluative labels to place upon the program
("Great," "Good," "Mediocre," etc.). Ultimately, summative
evaluation is tied to decisions about support and
continuation of a program.

These two purposes are not mutually exclusive. A
summative evaluation can subsequently be used for formative
purposes. Similarly, a formative evaluation at a given
moment can lead to answering such questions as whether or
not enough progress has been made in the development of a
program. It should be noted, however, that some
procedures are not equally suited for the two types of
evaluation. In a formative evaluation it is important to obtain
day-to-day feedback on specific aspects of the
instruction. This information is, however, often fragmentary
and of limited use in a summative evaluation. Conversely,
an assessment of the impact of a new program on the image
of language study in a particular school may be very
important in a summative evaluation but is inappropriate
for a formative purpose.



Formative and summative evaluators often do not "behave"
in the same way. A formative evaluator's procedures may
be more partisan than a summative evaluator's approach.
The summative evaluator must be objective and circumspect.
A formative evaluator can use shortcuts, small samples,
and intuition in an effort to improve the program.

There is a need for more formative evaluation in language
programs. Rarely do we initiate evaluation for the
purpose of improving our programs. Even in light of the very
heavy loads of most instructors, evaluation is still far
too important an activity to be so neglected.



Defining the Program

At first glance, defining a program may seem to be an
unnecessary task, one that is so obvious that no effort
is necessary. In many schools and colleges, however, it
may be very difficult to make decisions about the scope
of an evaluation.

Consider the following example of a fictitious high
school (Central High). The staff at Central High has
decided to undertake an evaluation of its second language
program. The school houses grades eight through twelve,
though eighth and ninth grade students are classified as
junior high students. Four major languages are taught:
French, Spanish, German, and Latin. The program offers
four years of study in each language. In addition, the
assistant principal teaches eight students Italian twice
a week, though no credit is given. Most students begin
studying a language in the ninth grade, but Central High
recently implemented an exploratory language program for
junior high students, as well as an immersion program for
qualified students, which is offered during the summer by
temporary staff hired expressly for that program.
Furthermore, the social studies department offers courses on
French and Spanish history, and the English department
offers courses in European literature. A language lab is
shared with the business department, which uses it to
teach shorthand.

Before beginning evaluation procedures, the staff at
Central High has some decisions to make. Will the
evaluation cover all the courses in all the languages offered,
or is Latin to be excluded from this evaluation? Should
the exploratory program offered for junior high students
be included, or is it part of the junior high language
arts program? Should the summer immersion program be
included in the evaluation, or is it considered a special
program? Because the special courses offered in social
studies and English are often taken in conjunction with
the language studied, should these courses also be
included in the evaluation?

Questions such as these, though sometimes very difficult,
must be answered to the satisfaction of all participants
prior to the start of an evaluation. Each situation is
unique, and there are no firm guidelines to follow. The
kinds of decisions to be made on the basis of the
evaluation, the resources and time available, and the political
implications of the various alternatives must all be taken
into account.

Once the scope of the program to be evaluated is
determined, an accurate description of who and what are involved
in it can be prepared. This descriptive document serves
two purposes: it brings together all who are interested
in the evaluation, and it clarifies in the minds of the
evaluators exactly what the program's characteristics are.
This document can be brief, but it should include the
following:

1. Demographic information about the staff (e.g.,
   backgrounds, degrees, experience, etc.)

2. Demographic information about the student
   population

3. Description of the course offerings and
   enrollments

4. Relevant historical information (e.g., Is it a
   long-standing program or relatively new?)

5. Description of facilities

6. Pertinent information about the role of the
   program in the total offerings of the school
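
As a rough illustration, the six items listed above could be kept as a
simple checklist while the descriptive document is being drafted. The
Python sketch below is only one possible arrangement: the class name,
field names, and `missing_sections` helper are hypothetical, and the
sample entries are taken from the fictitious Central High example.

```python
from dataclasses import dataclass, fields

@dataclass
class ProgramDescription:
    """One text field per item in the list; all names are illustrative."""
    staff_demographics: str = ""
    student_demographics: str = ""
    course_offerings_and_enrollments: str = ""
    historical_information: str = ""
    facilities: str = ""
    role_in_school_offerings: str = ""

    def missing_sections(self) -> list[str]:
        """Return the names of sections that have not been written yet."""
        return [f.name for f in fields(self)
                if not getattr(self, f.name).strip()]

# Partial draft for the fictitious Central High program.
draft = ProgramDescription(
    course_offerings_and_enrollments=(
        "Four years each of French, Spanish, German, and Latin"
    ),
    facilities="Language lab shared with the business department",
)
print(draft.missing_sections())
```

A list of this kind simply makes it visible, before the evaluation
starts, which parts of the description still need to be gathered.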



Who Is to Do the Evaluation?



The backgrounds of those who conduct an evaluation
determine to some degree the quality and nature of the process.
An outside evaluation expert will conduct a
methodologically sound evaluation but may miss important information
specific to our field. Such evaluations are also often
restricted in scope and time because of the expense
involved. Evaluation by a foreign language education
expert may be methodologically less elegant; however, the
specific competence in our field is a compensating factor.
This type of evaluation usually also operates under scope
and time restrictions for reasons of cost. Evaluation
solely by the staff of a program introduces various kinds
of biases and school politics. It may be better than no
evaluation at all, but it is the least attractive option.

The most feasible solution to these limitations seems to
be a combination of talent. A carefully done self-study
by the program's staff followed by an evaluation by an
outside expert in language education provides many
advantages. (Ideally, the expert is also experienced in
evaluation.) It capitalizes on the local, relevant skills,
yet provides the conscience, fresh outlook, and expertise
of an outsider. Such a procedure facilitates the work of
the outside expert, frequently permitting that person to
accomplish his or her evaluation with no more than two
days spent visiting the school. Thus, expense is reduced
and quality maintained.

Evaluating the Goals and Objectives 
of the Program 

The educational aims of a program are a crucial initial
consideration in evaluating any type of instruction. To
accept goals without questioning their validity or
desirability is to commit a serious error that reduces the
remainder of the evaluation to an exercise in irrelevance.
The goals and objectives are important from three
perspectives:

1. It is largely in terms of goals that one must
   judge the classroom behavior of learners and
   teachers.

2. There are more and less desirable ways for
   formulating goals. The process by which they are
   formulated and utilized must itself be judged.

3. The content of the goals and objectives must be
   judged. Is it consistent with the needs and
   nature of society? Is the content consistent
   with the most up-to-date knowledge of the
   teaching/learning process? Is the content
   consistent with the nature of the students?

Goals cannot be considered an optional component of a 
second language program. They are essential. Their 
inclusion is not the result of any educational fad, nor 
is it a simple matter of responding to societal trends 
like accountability or specificity. Some would claim, 
for example, that we are in an age of specificity and 
accountability and that the use of goals and objectives in 
educational programs is merely one manifestation of that 
phenomenon. Some instructors, moreover, have experienced 
an administrator's fiat that their curriculum must be 
designed around objectives of a particular type (usually 
behavioral) "beginning next Monday." Their attitudes
are predictably hostile. This is unfortunate, for
education is inherently purposeful. Students have purposes or
aims, and so do teachers. These aims are inevitably
present whenever two persons come together in teacher and
student roles. To the degree that their respective
purposes coincide, the educational process will be enhanced.
But even when the purposes or objectives do not coincide
or are not verbalized, they are nevertheless present.

Statements of objectives serve purposes beyond clarifying
the intent of their formulator: they function as a
communication device among all groups involved in the
educational process, including teachers, administrators,
parents, and other interested parties. There are many
ways to formulate statements of aims. One important
distinction, however, relates to their level of generality.
Statements of aims can be in very broad, general terms
or, at the other extreme, in very specific, concrete
behavioral terms. The former are often referred to as
statements of "philosophy" in secondary schools and
"mission" statements in colleges and universities. They
represent the most succinct statements of the raison
d'être for the program; moreover, they represent the
point of departure or basis for the formulation of all
the specific objectives. From a statement of goals (the
term "goals" is frequently applied to this most general
statement of aims), one can proceed through increasingly
specific statements of aims in regard to abilities or
knowledge, ultimately reaching very explicit statements of
behavioral objectives (the term "objective" being
conventionally applied to specific statements of aims). An
objective may, for example, lead to a specific classroom
activity on a particular day. The number of stages from
the goal statement to the most specific objectives is not
nearly so important as their continuity. It is imperative
that the specific objectives that guide day-to-day behavior
add up to the goal of the program.

As an example of this progressive differentiation of aims,
one might envision a program consisting of goals,
sub-goals, end-of-course objectives, and learning objectives.
The broad goal statement is broken down into several
sub-goals, which, in turn, are broken down into objectives
for the end of a semester, quarter, grading period, or
even an academic year. These end-of-term objectives are
further categorized into a large number of learning
objectives, which guide decisions about the
moment-to-moment behavior in the classroom.

Schematically, the process is represented by Figure 1.
Sample content for the lettered boxes is given below.
(As mentioned earlier, such statements must be developed
locally for each program and cannot be "copied" from any
source.)

A. (Sample Goal Statement)

Each and every moment of one's life can be viewed as
a continuing effort to communicate or to interact
with one's total environment. Students should
therefore have the opportunity to enhance their knowledge
and ability to communicate and interact with the
world that surrounds them. This ability implies
understanding the communicative process, recognition
and acceptance of new patterns of thought, awareness
of one's self and how individuals differ from one
another, a positive self-concept, an adaptability and
flexibility in the face of the unfamiliar and change,
and diverse intellectual skills. The study of a
foreign language can lead to these outcomes.

[Figure 1: lettered boxes A through D, from goal statement
to learning objectives]

Note that the goal statement is very broad or general
("philosophical," to some). It is also concise in that
it says a great deal about the aims of the program in
relatively few words. The content delineates the
contribution that the study of a language can make to the total
education of a person in the late twentieth century.
Goal statements relate our subject matter to the
fundamental purposes of all education.

B. (Sample Sub-Goal Statement)

Students will develop the ability to communicate
orally in the language.

Sub-goals are one step more specific. Each of them
represents a segment of the learning implied in the goal
statement. It is more specific than the goal: it focuses
on oral skills. It is not sufficiently specific,
however, for making hour-to-hour decisions about classroom
behavior.

C. (Sample Curricular Objective) 

Students will be able to ask and answer conversation- 
type questions about the weather in a way that is 
understood and does not irritate native speakers. 

Curricular objectives are one step less general and more
concrete. These end-of-term or end-of-course objectives
are usually phrased in behavioral terms, though the
behavior specified is an amalgamation of specific
behaviors.

D. (Sample Learning Objective)

Students will be able to use the irregular verb faire
with 80 percent accuracy in idiomatic expressions to
indicate it is warm, cool, cold, windy, and nice in
oral responses to questions.

Learning objectives are very specific. They describe
behavior that students will exhibit within one class
period or perhaps a few days. They indicate student
behavior in regard to particular structures and vocabulary.
It is important, however, that each specific learning
objective be a particle of the overall goal. Too often,
they are created with no consideration of the overall
purpose of the program.

Thus, the goal refinement process must be viewed (and
must be judged) as a process of progressive
differentiation. Goal statements provide direction for a program
and engender specific objectives that guide
moment-to-moment classroom behavior. Any given classroom technique
or strategy can only be judged by reference to an
objective. One cannot ask in the abstract, for example,
whether a translation is good or bad or whether a
particular drill should be longer or shorter. Such questions
can be answered only within the context of the initial
learning objective.

The content of the goals and objectives is never easily
decided. Such decisions are always value judgments made
about inherently complex matters. The needs and nature
of society must be considered and interpreted. Moreover,
it is a society of tomorrow that must be envisioned, for
our students are going to spend the major part of their
lives in the twenty-first century.

The nature of the learner, including inherent limits on
what can be learned in a given amount of time, must be
considered. Language educators have in the recent past
been guilty of violating this fundamental requirement.
In many programs we have specified unattainable goals.
Some of us have unrealistically said, for example, that
students would achieve what amounts to a form of
bilingualism by the end of four semesters of language study.
We have also tended to expect unrealistic ability in the
writing skill, particularly in consideration of its
difficulty and importance.

The body of knowledge about the second language teaching/
learning process must be considered in formulating or
judging goal statements. We have, for example, become
increasingly aware of the importance of genuine
communicative or meaningful use of the language rather than purely
manipulative use. In the past, too many programs had
implicit, if not explicit, goals that stopped short of
communicative use. As a result, a large number of
students were very skilled in doing all forms of drill and
exercise but could not ask directions to a hotel or read
a menu. The transfer from drills and exercise did not
occur automatically. One of the important contrasts
between today's goals and those of a decade ago relates
to this recognition. Many evaluators today find goal
statements that are slightly anachronistic because of
this excessive emphasis upon rote learning and drill.

A growing number of goal statements reflect current
interest in non-language skill outcomes. Cross-cultural
understanding, insights into the communication process,
adaptability in the face of the unfamiliar, and mental
dexterity are increasingly specified as aims of the study
of a second language.

Guideline Questions for Judging the Statement of Aims

Are the statements of aims clear and well defined? 

Do the specific learning objectives have a direct 
and clear link to the broad goal statements? 

Are the goals realistic? Are the objectives
appropriate for the level of instruction?

Are the goals consistent with the needs and interests
of the students? Of the staff? Of the governing
bodies of the school? Of the community or area? Of
the contemporary world?

Are the goals consistent with learning theory? Are
they organized and sequenced for efficiency?

Are the goals compatible (e.g., not emphasizing both
accuracy of pronunciation and ability to communicate
from the first days of instruction)?



Selecting Instruments, Collecting Data, 
and Analyzing Results 

In many phases of an evaluation it is necessary to collect
information from relatively large numbers of persons
(e.g., students in the program, other students, community
members, faculty members). It is usually most efficient to
do this with measurement instruments (tests, questionnaires,
etc.). It follows, therefore, that the quality of these
instruments must be carefully considered. The wise
selection (or creation) of instruments, the careful collection
of data, and the correct choice of procedures for analyzing the
data are essential to a high quality evaluation. A plan
for all data collection and analysis should be made prior
to starting an evaluation. This plan will help ensure that all
information relevant to the quality of a program will be
collected. It should be reemphasized that only after
complete and accurate descriptive information has been
collected can judgments be made about the worth of the
program.

Using Existing Information 

Before selecting or constructing instruments to obtain
the desired data, the evaluator should determine whether
or not the information might already be available from
other sources such as instructors' records, school records
and files, counselors' or advisors' files, records of
other personnel (e.g., admissions office, extracurricular
activity advisor, coaches), or records held by parents or
students themselves. In many situations, the above sources
will be of limited value because the information may not
directly address the questions being asked in the
evaluation. An evaluator may, for example, want to determine
what percentage of students have studied a second foreign
language, and the high school records may not indicate
languages studied in junior high school. Another concern
is the accuracy of the data. Because the evaluator has
had no control over the way in which the information was
collected, its reliability may be questionable and the
data therefore unusable.



Selecting Instruments 

When the information needed cannot be obtained from
available records, it is necessary to select or construct
instruments to obtain the data. The most important
consideration in selecting instruments is whether they
measure adequately what they are intended to measure. If
a test is intended to measure proficiency in the target
language, does it indeed measure the appropriate language
skills, does it use the important vocabulary and
structures, and is it at an appropriate level of difficulty?
If the instrument is a questionnaire to determine student
backgrounds or attitudes toward language study and toward
the instruction, is it written with clarity and at an
appropriate level of sophistication? (For instance, high
school students will not understand terms like
"ethnocentrism.") Other considerations are

1. Reliability: If the test could be re-administered
(after all the effects of having responded once
before had been magically erased from the minds
of those responding), would it yield similar
results? Common threats to reliability from
within the instrument are ambiguous questions
and poor quality reproduction.

2. Ease of use: Is the instrument relatively easy 
to administer and summarize or score? 

3. Time and resources: Can the instrument be used
within the time blocks available (e.g., a class
period)? Does it have to be purchased? If so,
is the cost reasonable for the benefit to be
gained?

4. Population: Is the instrument appropriate for 
the group with which it will be used? 

5. Form of the resulting data: Does the instrument
yield objective data? If not, is there provision
for maintaining the quality of subjectively
collected data? If several persons will collect
the data, is there control for the differences?



Types of Instruments



It is impossible in a paper such as this to provide great
detail about types of instruments. This is one area
where many language teachers may wish to consult
measurement experts or psychometric textbooks, if a member of
the staff is not already familiar with use of the
particular instrument type.

Once a decision is made as to what is to be assessed
(achievement, performance, attitudes, interaction among
persons), the type of instrument can be determined. One
useful way for categorizing instruments is in terms of
four broad types (TenBrink, 1974): observation, inquiry,
analysis, and testing.

Observation may include such items as anecdotal records,
checklists, rating scales, and rankings. These methods
may or may not be very objective. The quality depends on
how well the criteria are defined for each instrument.

The category of inquiry includes questionnaires,
interviews, and (for certain types of evaluation) sociometric
instruments and projective techniques. Questionnaires
and interviews are particularly useful for identifying
attitudes held by language students, attitudes that are
pertinent to the quality of the program.

Some language educators have successfully used the Foreign
Language Attitude Questionnaire (FLAQ) (Jakobovits, 1970).
It is intended for all ages and has two forms: one designed
for students who are studying or have studied a foreign
language and the other for those who have not studied a
language. The questionnaire asks about the students'
language background, their attitudes toward foreign
language study, and their reasons for study. The format is
multiple-choice with an opportunity for additional
comments. It may be most useful as an example or model, for
most language educators would be best satisfied with
constructing an instrument designed for their own
particular curriculum.

Analysis, as defined by TenBrink, includes the techniques
of content analysis and interaction analysis. Content
analysis is a counting procedure, whereby written or
spoken communications are analyzed for the presence or
absence of certain characteristics (1974:146).
Interaction analysis is useful for obtaining information on
group participation, individual student interaction with
the group, and instructor interaction with the class.
It can provide descriptive information about the
percentage of time spent by the instructor and students using
the target language versus using the native language, or
teacher talk versus student talk, or any other categories
of behavior that are deemed relevant. [A more extensive
discussion of these systems appears in Appendix B.]
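
As a minimal sketch of the kind of tally that interaction analysis
produces, the following Python example computes the percentage of
observed intervals by speaker and by language. The coding scheme (one
code per fixed observation interval, with the labels "teacher"/"student"
and "target"/"native"), the function name percent_by, and the sample
record are all illustrative assumptions; they are not taken from any
particular published observation system.

```python
from collections import Counter


def percent_by(observations, index):
    """Percentage of coded intervals falling in each category.

    observations: list of (speaker, language) codes, one per interval.
    index: 0 to summarize by speaker, 1 to summarize by language.
    """
    counts = Counter(obs[index] for obs in observations)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: 100 * n / total for category, n in counts.items()}


# Hypothetical record of 20 observation intervals from one class period.
observations = (
    [("teacher", "target")] * 12
    + [("teacher", "native")] * 4
    + [("student", "target")] * 3
    + [("student", "native")] * 1
)

talk_by_speaker = percent_by(observations, 0)
talk_by_language = percent_by(observations, 1)
print(talk_by_speaker)
print(talk_by_language)
```

Figures of this kind are purely descriptive. Whether a given share of
teacher talk or native-language use is desirable remains a judgment
to be made against the program's objectives.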

The fourth category, testing, is the one most often used
for evaluation purposes. Achievement tests and
diagnostic/prognostic tests are available commercially.
Diagnostic tests include the Modern Language Aptitude Test
(MLAT) (Carroll and Sapon, 1959), the Elementary MLAT
(Carroll and Sapon, 1967), and the Pimsleur Language
Aptitude Battery (LAB) (Pimsleur, 1966). The MLAT has been
widely used for many years. It is designed for
English-speaking persons from grade 9 through adult. It has two
forms: a long form requiring 60-75 minutes to administer,
and a short form requiring approximately 40 minutes. The
short form simply omits parts 1 and 2 of the long form
(and has nearly as good validity). The total test has 5
parts:

1. Number learning: Students learn and then are
tested on numbers in a new language.

2. Phonetic script: Students learn sound-symbol
correspondences and are then asked to select the
correct transcription for spoken words.

3. Spelling clues: Students select synonyms for
coded English words.

4. Words in sentences: Students must manipulate
various grammatical concepts without the use of
any grammatical terminology.

5. Paired associates: Students memorize vocabulary
in a new language.

The Elementary Modern Language Aptitude Test is similar
to the MLAT but has tasks that have been simplified to
make it appropriate for younger students down to grade 3.
It requires about 60 minutes to administer. (There is
no short form.)

The Pimsleur LAB takes approximately 60 minutes. It is
intended for English-speaking students in grades 6-12.
It consists of six parts, which the test defines as the
components of aptitude: Grade-Point Average, Interest,
Vocabulary, Language Analysis, Sound Discrimination, and
Sound-Symbol Association.

The principal role for aptitude tests in a language
classroom is as a diagnostic device. In an evaluation, an
aptitude test can provide information about the student
population and therefore about the match between student
characteristics and the type of curriculum.

Achievement tests include the Common Concepts Foreign
Language Test (California Test Bureau, 1962), the MLA
Cooperative Foreign Language Tests (Educational Testing
Service, 1965), and the Pimsleur Modern Foreign Language
Proficiency Tests (Pimsleur, 1967).

Appropriate for all grades, the Common Concepts Foreign
Language Test is about 40 minutes in length and tests
listening comprehension in French, German, and Spanish.
Students hear sentences in the foreign language and
indicate their understanding of what they have heard by
selecting from sets of four colored pictures the ones
that have been described.

The MLA Cooperative Foreign Language Tests (French,
German, Italian, Russian, and Spanish) measure listening,
speaking, reading, and writing skills. One form is
appropriate for levels one and two, the other for levels
three and four.

The Pimsleur Modern Foreign Language Proficiency Tests
are available in French, German, and Spanish for levels
one and two. They also test the four skills.

Care must be exercised when choosing standardized tests.
Klein (1971) points out that standardized tests often
have such limitations as questionable validity, poor
overlap between program and test objectives,
inappropriate instructions and directions, and confusing designs
and formats. The overlap between vocabulary used on
these tests and vocabulary taught in most U.S. language
programs is discouragingly small. In most situations, a
language staff can create an instrument that provides
data that are more trustworthy than the results from any
commercially available test. Test construction, however,
requires considerable time, effort, and expertise. If a
test is to be constructed, the procedures should reflect
standard procedures used by professional test makers. It
should, first of all, incorporate the best judgment of
all faculty members. This test should then be given
experimentally to a group of students and should be
revised on the basis of the way the test functions. Only
then is it ready for use in the evaluation.

It must be remembered that data from all instruments are
simply a description of outcomes and characteristics of
the program and its participants. The choice of
instruments is therefore synonymous with the question, What
kinds of information will be necessary in order to make
judgments about the quality of the program?

Collecting the Data

After the appropriate instruments have been created or
selected for obtaining the desired information, the
procedures for collecting the data must be determined.
Collecting data is an important step requiring careful
planning. It can involve delicate interpersonal relations,
because its success often depends on the cooperation and
assistance of persons who are not directly connected with
the evaluation. If a "data collector" inadvertently
makes a mistake, it could affect the validity of the data.
Thus, careful training and specific instructions are
essential considerations for an evaluator. Law and
Bronson (1977) suggest the following steps:

1. Make the necessary arrangements with the school(s)
and the personnel who are to be involved.

2. Decide who will collect the data.

3. Make the arrangements for any training that is
needed.

4. Schedule the collection within the time allotted
for this phase of the evaluation.

5. Monitor the total data collection process.

An important consideration is whether the data collection
is to be planned and carried out by someone within the
program or by someone from outside. If it is done by
personnel within the program, they should be aware of two
potential sources of bias. First, they may have a vested
interest in the program and may focus excessive attention
on its successful aspects while de-emphasizing
troublesome elements. Second, they may be so conscious of and
attentive to the program objectives that they overlook
secondary effects that are equally important.

Analyzing the Data

It would be very convenient if data that have been
collected could be fed into a computer, which would then feed
back a printout with both a summary and an interpretation
of the information. Unfortunately, neither the
summarization nor the subsequent judgment is that simple.

Summary of data requires many decisions. Counting
frequencies and percentages is mechanical; however,
categories may be modified (especially by being combined or
eliminated). One may, for example, elect to combine
"agree" and "strongly agree" categories in a scale because
few respondents have chosen the strongly agree or strongly
disagree categories. It may also be important to
emphasize or highlight certain aspects of the data. When this
is done for a valid purpose, it is of course justified.
One must recognize, however, that there is a very fine
line between emphasis and distortion.
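
The combining of categories described above can be sketched in a few
lines of Python. This is only an illustration under stated
assumptions: the five-point scale labels (including "undecided"), the
COLLAPSE mapping, the function name summarize, and the twenty sample
responses are hypothetical and do not come from any instrument named
in this paper.

```python
from collections import Counter

# Hypothetical mapping that folds the two extreme categories into
# their neighbors, turning a five-point scale into a three-point one.
COLLAPSE = {
    "strongly agree": "agree",
    "agree": "agree",
    "undecided": "undecided",
    "disagree": "disagree",
    "strongly disagree": "disagree",
}


def summarize(responses, mapping=None):
    """Return {category: (frequency, percentage)} for a list of responses.

    If a mapping is given, each response is recoded before counting.
    """
    if mapping is not None:
        responses = [mapping[r] for r in responses]
    total = len(responses)
    if total == 0:
        return {}
    counts = Counter(responses)
    return {cat: (n, round(100 * n / total, 1)) for cat, n in counts.items()}


# Invented answers to a single questionnaire item.
responses = (
    ["strongly agree"] * 2
    + ["agree"] * 10
    + ["undecided"] * 3
    + ["disagree"] * 4
    + ["strongly disagree"] * 1
)

original = summarize(responses)
combined = summarize(responses, COLLAPSE)
print(original)
print(combined)
```

Reporting both the original and the combined tables is one way to let
readers see what the recoding has hidden, which helps keep emphasis
from sliding into distortion.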

It may be useful with some types of data to ask whether a
pattern or difference is due to particular characteristics
or features of a program or due to chance variation.
Tests of statistical significance (chosen by someone with
relevant competence) can help answer that question.

A distinction must be made between statistical
significance and practical significance. To determine
statistical significance, one calculates the probability that a
given event could have occurred by chance alone. If the
probability of occurrence by chance is small, the
researcher concludes that the results are due to non-chance
factors or to the condition or program under investigation.

Statistical significance is sensitive to the number of
people, items, or events involved. The larger the sample
size, the more likely one will be able to rule out chance
as accounting for a particular difference or pattern.
Statistical significance is only part of the total
picture. Because important decisions must be made on the
basis of an evaluation, it is often necessary to show
practical significance as well. Program evaluators might
have determined, for example, that a group of students
who used the language laboratory were superior in
listening skill to other students who experienced the same
instruction but who did not use the laboratory. Let us
assume that with the help of a statistician it has been
determined that the few points of advantage for the
laboratory students are statistically significant. It is
entirely possible, however, that the staff (or
administrators) might judge that the slightly better listening
ability is not worth the high cost in dollars of
maintaining the lab.
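
The dependence of statistical significance on sample size can be
illustrated with a short Python sketch. Everything in it is an
assumption made for illustration: the listening scores are invented,
the function name compare_groups is hypothetical, and the p-value uses
a large-sample normal approximation rather than whatever test a
statistician would actually select for a given evaluation. The same
one-point advantage for the laboratory group is examined with 5 and
with 2,000 students per group.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def compare_groups(lab, no_lab):
    """Compare two independent groups of scores.

    Returns the mean difference, an approximate two-sided p-value
    (normal approximation), and a standardized effect size.
    """
    n1, n2 = len(lab), len(no_lab)
    s1, s2 = stdev(lab), stdev(no_lab)
    diff = mean(lab) - mean(no_lab)
    # Standard error of the difference between two independent means.
    se = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    z = diff / se
    # Probability of a difference at least this large by chance alone.
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    # Difference expressed in pooled standard deviation units.
    pooled_sd = sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))
    return {"difference": diff, "p_approx": p, "effect_size": diff / pooled_sd}


# Invented listening scores; the lab group scores one point higher.
base = [70, 75, 80, 85, 90]
small = compare_groups([x + 1 for x in base], base)
large = compare_groups([x + 1 for x in base * 400], base * 400)
print(small)
print(large)
```

With 5 students per group the one-point difference is easily
attributed to chance, while with 2,000 per group the approximate
p-value falls far below conventional thresholds. The size of the
advantage, however, is the same one point in both cases, well under
one fifth of a standard deviation. Whether that advantage justifies
the cost of the laboratory is a question the numbers cannot answer.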

Thus, value judgments must be made "on the bottom line"
regardless of the sophistication of the data. The numbers
function to show variation, but they cannot tell whether
or not the variation is desirable. The evaluators must
look at the numbers and then judge whether they are "good"
or "bad." Such value judgments can never be made in the
abstract.

Evaluating Classroom Behavior 

The mass of classroom behavior by instructors and students
is extremely large, diverse, and complex. There is
extensive evidence that we do not understand well the
phenomena involved. Research on how people learn in
general and how they learn foreign languages in particular
is in its infancy. Teaching effectiveness is the most
researched area in all of education, and yet there are
pitifully few generalizations one can make about factors
that contribute universally to success. In order to
evaluate, one must either resort to a single global
impression of the quality of the instruction or break the
many dimensions of the behavior into manageable units.
Yet, at the same time, each small unit or segment of
activity must not be separated from the context in which
it occurs. Out of context it becomes uninterpretable.
To ask, for example, whether or not a particular homework
assignment was appropriate requires consideration of many
factors or variables. What is the overall objective?
What is the specific purpose of the activity? What
preceded it? What will follow it? What is the level of
instruction? What are the characteristics of the
learner(s)? What is the instructor's teaching style? Such a
list of questions could go on for pages, but the questions
nevertheless must be asked intuitively whenever one is judging any
segment of classroom behavior or materials.

As demanding and difficult as the evaluation of
instruction is, it must be judged. It is in the classroom that
the impact of all variables becomes real. As with other
phases of evaluation, the first stage requires an accurate
perception and description of the pertinent phenomena.
Only then can the second stage, judgment of worth, take
place. As discussed elsewhere in this paper, behavioral
observation systems (and other instruments) are one means
of increasing the objectivity of the description. (There
remains, of course, subjectivity in the choice of
categories of behavior that are to be observed.) It is not
possible, however, to observe systematically all
dimensions of instruction. Frequently, the most defensible
evaluation is a rating of a dimension of the instructional
activity and materials. On the basis of many criteria
which fall under the rubric of "experience and knowledge,"
the evaluator observes carefully and dispassionately and
then makes a judgment.

Listed below is a series of questions designed to guide
observation and judgment of instructional behavior and
materials. There are no single "right" or "best" answers
to them, and the list is not all-inclusive. For some of
the questions there would be widespread agreement on a
most desirable response; for others, there would be little
agreement; for still others, the most desirable response
depends on the particular teaching/learning situation.
Each draws attention to a particular aspect of
instruction so that the evaluator can make a judgment about it.
Each is also conducive to a "why" or "why not" follow-up
question.

A. The Student Population

1. Is the program open to all students, or is it
restricted to certain types or categories? Are
the restrictions valid?

2. Does the program attract a sufficient proportion
of those students who are eligible to enroll?

3. Are students well informed about the nature of
the program, its goals, and the benefits of
language study?

B. Program Scope

1. Is the scheduling of language classes consistent
with the program goals and the interests of the
faculty?

2. Can students move comfortably from one course or
one level to the next?

3. Is the length (number of courses or years) of
the program appropriate? Is the breadth of
offerings adequate?

4. Is the overall sequence of content logical? Is
it consistent with the program objectives and
today's knowledge?

C. Administrative Considerations

1. Is there good communication with the rest of the
school or university? With those responsible for
the advising and counseling of students? With
the administration? With other faculty members?

2. Is the ratio of students to faculty acceptable?

3. Is there an adequate amount of extracurricular
activity relating to language study?

4. Are community resources utilized in the program
(e.g., native speakers, companies with
international contacts, museums, etc.)?

D. Methodology/Classroom Activities

1. Is the methodology consistent with the goals of
the program? With the specific objectives of any
particular moment?

2. Is the methodology logically consistent?

3. Is the moment-to-moment sequence of activities
logical (e.g., simple to complex, beginning with
the known, leading to genuine communicative
ability)?

4. Are the steps from one activity or segment to the
next of the right size (i.e., do they challenge
the students while not being too difficult)?

5. Are sufficient examples and models given?

6. Are presentations clear and interesting?

7. Is the classroom pace appropriate?

8. Are correction strategies and behavior wisely
used?

9. Is there adequate opportunity for practice of the
language?

10. Do students work individually, in small groups,
or in large groups according to the nature of the
task and its purpose?

E. Teacher/Student Behavior

1. Are the instructors fully aware of the goals?

2. Is the teacher role consistent with the philosophy
of the school or department? Are instructors
comfortable in their role?

3. Do instructors have an opportunity to interact
with colleagues? To visit the classes of other
instructors?

4. Is the moment-to-moment teaching behavior valid?

5. Do the instructors consider student
characteristics such as age, aptitude, motivation, and
interests in all their interaction with them?

6. If student attitudes have not been surveyed by
formal instruments, does their behavior reveal
their overall attitudes toward the program?
Toward specific components of the program? Have
these attitudes been taken into account in
improving the program?

7. Is feedback given to students about their
performance? Does it take into account individual
differences?

F. Testing

1. Is testing valid in terms of program goals? Does
it measure genuine communicative or meaningful
use of the language?

2. Does testing use efficient established procedures
and item types?

3. Does the testing have face validity (e.g., do
students perceive it as fair)?

G. Materials

1. Are the instructional materials consistent with
the goals? Is the basic text supplemented by
readers, workbooks, or other materials?

2. Are audiovisual aids available and used
effectively?

3. Are the materials authentic? Do they create
intercultural awareness without reinforcing
stereotypes? Do they represent all segments of
society and all cultures where the target language
is spoken?

4. Are teacher-made materials of good quality?

H. Facilities

1. Are the facilities adequate? Do they meet basic
needs (e.g., space, light, ventilation, etc.)?
Are they conducive to teaching and learning?

2. Is the library adequate? Does the staff have a
professional library?

3. Are there adequate support services (e.g.,
clerical, aides, etc.)?



Honest and accurate answers to such questions form the
basis for judgments about the value or worth of
instructional components. They are probably the most important
evaluative activity in judging the process (as opposed to
the products) of education.

Writing the Evaluation Report 

The preparation of a final document that communicates to
all concerned the findings of the evaluation is an
important last step in the evaluation process. Perceptions
presented orally are very susceptible to varied
interpretation. Furthermore, considerable clarification of the
ideas occurs during the preparation of the document.

The format is a matter of preference and style. The
content will vary depending upon the local conditions.
In one way or another, it will include what was observed
and any recommendations that may be made. In its simplest
form, where there has been a self-study, it will merely
endorse the self-study.

The following outline reflects one possible "table of
contents" for an evaluation report.

I. Objectives of the Evaluation

    A. Rationale for evaluating
    B. Audience
    C. Decisions that may be anticipated

II. Description of the Program

    A. Educational philosophy
    B. Goals and objectives
    C. Staff
    D. Instructional procedures/methodology
    E. Content
    F. Student population
    G. Community settings
    H. Facilities

III. Program Outcomes

    A. Student achievement
    B. Attitudes
    C. Side effects
    D. Costs

IV. Judgments about Program

    A. Value of outcomes
    B. Strengths
    C. Weaknesses
    D. Recommendations



Conclusion 

Obviously, any evaluation report that is simply filed
after being completed has demanded far more effort and
time than can be justified. An evaluation that is
utilized to improve instruction is not only beneficial to
everyone associated with a program but is also very
satisfying to those who participated.

There is a critical need to do more evaluation of our
language programs. It is especially important that we
conduct evaluations at "non-crisis" times, that is, times
when our programs are not under direct attack. Such evaluations
should be solely for the purpose of improving the programs.

The most useful evaluations, moreover, are those that are
designed and planned at the local level. Such evaluations
benefit from general guidance, but because each situation
is unique, precise specifications cannot be made. It has
been the intent of this paper to provide the guidance
needed for successful evaluations. One source of
consolation for anyone facing the many decisions that have to be
made in doing an evaluation is the fact that there are
always multiple "good" decisions. Evaluation is not an
activity in which one must follow a prescribed set of
procedures. It comprises, in the final analysis, all
activities in which one carefully gathers information and
then makes judgments of worth or quality. This model is
well suited to all professional activity.

Principal Evaluation Models



Because of the complexity of educational evaluation, it
is not surprising that many evaluation experts have
turned to the use of symbolic models to clarify evaluation.
Like all symbolic models, they achieve manageability by
reducing the mass of behavior involved in evaluating to a
series of abstractions. The most frequently discussed
(and presumably utilized) models, as identified by House
(1978) and as described throughout the professional
literature, are



1. Systems Analysis 

One begins with measures of results of the program
(output measures) and attempts to relate them to
characteristics of the program. One asks whether or not
all dimensions and components of the system are
functioning effectively. Variations in the results of the
program (e.g., student learning or enrollments) are
traced to changes in the program characteristics (e.g.,
materials used, methodology, class time).

2. Behavioral Objective Attainment

The objectives of a program are delineated in very
specific terms of student performance. To determine
the extent to which these objectives are being attained,
students are tested, and the results are compared with
the objectives. The extent to which student
performance matches that specified in the objectives is an
index of the program quality.



3. Decision-Making Model

One begins with the decisions that are to be made on
the basis of the evaluation. These decisions guide the
nature of the evaluation. The evaluation supplies
information that is relevant to the decision. Thus, if
one were to make decisions about the student population
that was to study a language, variables such as age,
grade or year, language aptitude, college-bound or not,
major, and other grades might receive special attention.

4. Goal-free Evaluation

In an effort to reduce the effects of bias in evalua- 
tion, the intents of the program developers or decision 
makers are not revealed to the evaluator. The evaluator
must therefore search for all outcomes of the
program. The results of such evaluation can be used by
the developers to improve the program and by consumers 
(students) to accept or reject it. In some ways this 
model is similar to the Consumer Reports approach to 
evaluation. 

5. Art Criticism Model

The educational critic operates in the same manner and 
tradition as the art or literary critic. A major
assumption is that the evaluator has become skilled,
through training and experience, at judging the important
aspects of a program. This model is frequently
used in second language programs. Usually, an expert 
such as a foreign language education professor is 
brought in from the outside to spend several days on 
campus or at the school to make judgments about the 
program. Much of this person's expertise lies in how
well he or she can intuitively implement principles of 
judgment. 

6. Accreditation 

This frequently used form of evaluation usually involves 
visits by teams of colleagues from other schools. 
Principles of judgment are usually spelled out in
checklists of evaluative criteria. The local staff
have previously collected and analyzed information in
an extensive self-study. In many ways, the visitors
function as a conscience for the self-study. 

House also identifies as other major models the Adversary 
Model, in which the pros and cons of a program are argued 
in a manner not unlike a trial by jury, and the Transac- 
tion Model, in which the educational processes themselves 
are studied. Neither of these models seems to be in 
widespread use in foreign language education. 



Appendix B 
Observation Instruments 

Observation instruments can be very useful in describing 
classroom behavior. Essentially, they consist of a system
of categories of behavior into which instances of the
particular behaviors are classified as they take place. The
instrument may be used to record all instances of the
categories or a sample of them. Sampling is usually done via a
time sample in which occurrences of behaviors are recorded
at a regular interval (e.g., every 10 seconds or every 15
seconds). The behaviors are then summarized in such forms
as matrices, percentages, or simple frequency counts. 
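
A time sample and its summary can be sketched in a few lines
of Python. Everything in the sketch is an assumption made
for illustration: the category names, the 10-second
interval, and the representation of a lesson as a list of
(start second, category) events beginning at second 0.

```python
from collections import Counter


def time_sample(events, interval, duration):
    """Record the category in effect at 0, interval, 2*interval, ... seconds.

    events: list of (start_second, category), sorted by start time,
    with the first event starting at second 0.
    """
    samples = []
    idx = 0
    for t in range(0, duration, interval):
        # Advance to the last event that started at or before time t.
        while idx + 1 < len(events) and events[idx + 1][0] <= t:
            idx += 1
        samples.append(events[idx][1])
    return samples


def summarize(samples):
    """Return simple frequency counts and percentages for the samples."""
    counts = Counter(samples)
    total = len(samples)
    percentages = {cat: 100 * n / total for cat, n in counts.items()}
    return counts, percentages


events = [
    (0, "teacher question"),
    (12, "student response"),
    (25, "silence"),
    (31, "teacher question"),
]
samples = time_sample(events, interval=10, duration=40)
counts, percentages = summarize(samples)
print(samples)
print(percentages)
```

A shorter interval captures brief behaviors that a longer
one would miss, at the cost of more recording effort for
the observer.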

Observation instruments can be very useful in gathering 
accurate information about what occurs in classrooms. The 
technique, however, probably holds more potential than has
been realized so far. Instructors should therefore feel
free to modify existing systems or to create their own to
meet their own needs. 

There are several instruments available for coding class- 
room behavior. Flanders (1960) developed one of the first 
and most popular instruments. The Flanders Interaction 
Analysis Categories (FIAC) contain seven categories for 
coding teacher verbal behavior and three categories for 
judging pupil verbal behavior: 

TEACHER TALK 



   1. Accepts feeling: accepts and clarifies the feeling
      tone of the students in a non-threatening manner.
      Feelings may be positive or negative. Predicting
      and recalling feelings are included.

   2. Praises or encourages: praises or encourages student
      action or behavior. Jokes that release tension,
      not at the expense of another individual,
      nodding head or saying "uh huh?" or "go on" are
      included.

   3. Accepts or uses ideas of student: clarifying,
      building, or developing ideas or suggestions by a
      student. As teacher brings more of his own ideas
      into play, shift to category five.

   4. Asks questions: asking a question about content or
      procedure with the intent that a student answers.






   5. Lectures: giving facts or opinions about content
      or procedure; expressing his own ideas; asking
      rhetorical questions.

   6. Gives directions: directions, commands, or orders
      with which a student is expected to comply.

   7. Criticizes or justifies authority: statements
      intended to change student behavior from nonacceptable
      to acceptable pattern; bawling someone out;
      stating why the teacher is doing what he is doing,
      extreme self-reference.



STUDENT TALK 

   8. Student talk-response: talk by students in response
      to teacher. Teacher initiates the contact or solicits
      student statement.

   9. Student talk-initiation: talk by students, which
      they initiate. If "calling on" student is only
      to indicate who may talk next, observer must decide
      whether student wanted to talk. If he did, use this
      category.

  10. Silence or confusion: pauses, short periods of
      silence, and periods of confusion in which
      communication cannot be understood by the observer.
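
One common way to summarize a sequence of category numbers
is a matrix that tallies each pair of successive codes. The
Python sketch below assumes that approach; the text above
says only that behaviors may be summarized in matrices, and
the code sequence is invented for illustration.

```python
def interaction_matrix(codes, n_categories=10):
    """Tally successive code pairs: row = earlier code, column = later code."""
    matrix = [[0] * n_categories for _ in range(n_categories)]
    for earlier, later in zip(codes, codes[1:]):
        matrix[earlier - 1][later - 1] += 1  # categories are numbered from 1
    return matrix


# Hypothetical record of one short classroom episode (categories 1-10).
codes = [4, 8, 2, 4, 8, 8, 3, 5, 5, 10]
matrix = interaction_matrix(codes)

# How often a teacher question (4) was followed by a student response (8).
question_then_response = matrix[3][7]
print(question_then_response)
```

Reading across a row shows what tended to follow a given
behavior, which simple frequency counts cannot show.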



The FIAC has been modified and extended by various
researchers in an attempt to adapt the instrument to
varying philosophies and subject matter areas. Moskowitz
(1968) created an adaptation of the FIAC for the foreign
language classroom. Her Foreign Language interaction
system (FLint) includes, in addition to Flanders' categories,
the following items:

   1. the teacher
      -jokes
      -repeats student ideas verbatim
      -corrects without criticism
      -directs a pattern drill
      -criticizes student behavior
      -criticizes student responses
   2. silence
   3. confusion
   4. laughter
   5. English

Moskowitz also added a foreign language I/D ratio
(indirect/direct), an English I/D ratio, and the F/E ratio
(ratio of foreign language to English) for the total
lesson. 
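
The three ratios can be computed from a coded record as in
the Python sketch below. Two assumptions are made that the
text does not state: each observation is tagged "F" (foreign
language) or "E" (English), and teacher categories 1-4 are
counted as indirect and 5-7 as direct, following the usual
Flanders grouping. The observations are invented.

```python
INDIRECT = {1, 2, 3, 4}  # assumed grouping of teacher categories
DIRECT = {5, 6, 7}       # assumed grouping of teacher categories


def ratio(numerator, denominator):
    """Return numerator / denominator, or None when undefined."""
    return numerator / denominator if denominator else None


def flint_ratios(observations):
    """observations: list of (category, language), language "F" or "E"."""
    def id_ratio(language):
        indirect = sum(1 for cat, lng in observations
                       if lng == language and cat in INDIRECT)
        direct = sum(1 for cat, lng in observations
                     if lng == language and cat in DIRECT)
        return ratio(indirect, direct)

    foreign = sum(1 for _, lng in observations if lng == "F")
    english = sum(1 for _, lng in observations if lng == "E")
    return {
        "foreign I/D": id_ratio("F"),
        "English I/D": id_ratio("E"),
        "F/E": ratio(foreign, english),
    }


observations = [
    (4, "F"), (8, "F"), (2, "F"), (5, "E"), (6, "F"),
    (4, "F"), (8, "F"), (3, "E"), (5, "E"), (8, "F"),
]
ratios = flint_ratios(observations)
print(ratios)
```

A ratio is reported as None when its denominator is zero
(for example, a lesson with no English at all), since the
value is then undefined rather than large.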



Jarvis (1968) also developed a system for observing
foreign language classroom behavior. His instrument
classifies behaviors in terms of language skill acquisition
consequences of the behaviors. The instrument
distinguishes between "real," meaningful language use and
drill activity.

              TEACHER                        STUDENT

TARGET LANGUAGE

REAL          A  Evoking student response    1  Evoking response
              B  Evoked by student           2  Responding
              C  Classroom management
              D  Facilitating performance
                 or reinforcing behavior
              E  Information explanation

DRILL         G  Evoking stimulus            3  Individual response
              H  Repetition reinforcement    4  Choral response
              J  Prompting
              P  Modeling or correcting

READING       W  Presenting written          5  Writing
AND              language                    6  Reading silently
WRITING                                      7  Reading aloud

ENGLISH

              K  About target structure      8  Question about target
                 or sound system             9  Answer about target
              M  About meaning
              N  Management

+  Silence or English not in the above categories but
   which seems to facilitate learning
-  Silence or English not in the above categories but
   which seems to impede learning



At regular intervals of a predetermined number of seconds
the observer records a letter or a number, depending on
the behavior occurring at that particular instant. 
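
A record made this way is simply a string of letters and
numbers, and the balance between "real" language use and
drill can be read from it directly. The Python sketch below
is an illustration under stated assumptions: the record is
invented, and the two code sets list only a subset of the
instrument's codes (those used in the example).

```python
from collections import Counter

# Partial, illustrative code sets; only codes used in the example record.
REAL = {"A", "B", "C", "D", "1", "2"}
DRILL = {"G", "J", "P", "4"}


def real_drill_share(record):
    """Return the share of sampled instants coded as real, drill, or other."""
    groups = Counter()
    for code in record:
        if code in REAL:
            groups["real"] += 1
        elif code in DRILL:
            groups["drill"] += 1
        else:
            groups["other"] += 1
    total = len(record)
    return {group: n / total for group, n in groups.items()}


# Hypothetical record: one code per sampling instant.
record = list("AG4G4P4A1B2N")
shares = real_drill_share(record)
print(shares)
```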

Many types of observational instruments have been devel- 
oped since Flanders. (See for example Grittner, 1969
and Wragg, 1970.) The two described here are mentioned
because they have been developed specifically for the 
foreign language classroom. 

Observation instruments have several advantages:

1. They can provide valid and reliable information on 
certain classroom behaviors when the classroom is 
observed several times. 

2. They are adapted to a variety of tasks, settings, and 
individuals at all educational levels.

3. They can provide a valuable supplement to achievement 
data. 

4. They can provide both qualitative and quantitative 
data.

Caution must be exercised, however, whenever one plans
on using an observational technique. A long period of 
training and experience may be required for the observer. 
In addition, many activities take place simultaneously 
in a classroom, and it is often difficult to record 
behaviors that are significant. Interpretation of obser- 
vational findings must take into account the context and 
must not generalize from a very restricted sampling of 
behavior. 